Abstract.

A pseudoforest is a graph each of whose connected components is a tree or a tree plus an edge; a
spanning pseudoforest of a graph contains the greatest number of edges possible. This paper shows
that a minimum cost spanning pseudoforest of a graph with n vertices and m edges can be found in
O(m + n) time. This implies that a minimum spanning tree can be found in O(m) time for graphs
with girth at least $\log^{(i)} n$ for some constant i.
1. Introduction.


A pseudotree is a connected graph with equal number of vertices and edges, i.e., a tree plus
an edge creating a cycle. A pseudoforest is a graph each of whose connected components has at
least as many vertices as edges, i.e., each component is a tree or a pseudotree. Pseudoforests
arise in many applications although the terminology is not standard. We use the terminology
of [PQ], which uses pseudoforests to compute the density and arboricity of a graph; see [W] for
refinements of this approach. Pseudotrees are essentially the 1-trees used in [HK] to solve the
traveling salesman problem. The directed version of a pseudoforest is called a functional graph in
[Be], since it corresponds to the graph of a finite function. For this reason pseudoforests commonly
arise in parallel processing, when each processor chooses a successor (e.g., [GPS]). The pseudoforests
of a graph form the bicircular matroid, which is important in the study of rigidity of bar-and-body
frameworks [WW]. In the problem of minimum cost network flow with losses and gains [L], a linear
programming basis is a pseudoforest [D]. A pseudotree is also called a unicyclic graph (e.g., [Mfi]).

With these applications as motivation we propose the minimum spanning pseudoforest problem:
Consider a graph G with n vertices and m edges. A pseudoforest spans G if it has the greatest
possible number of edges. Assume every edge e has a real-valued cost c(e). The cost of a set of
edges is the sum of all its edge costs. A minimum spanning pseudoforest has the smallest cost
possible. This paper presents an algorithm to find such a pseudoforest in time O(m + n).

The pseudoforest problem relates to finding a minimum spanning tree. The best-known time
for finding a minimum spanning tree is $O(m \log \beta(m,n))$ [GGST], where

$$\beta(m,n) = \min\{\, i \mid \log^{(i)} n \le m/n \,\}.$$

Here log denotes logarithm base two, and $\log^{(i)} n$ is the iterated logarithm, defined by
$\log^{(0)} n = n$, $\log^{(i+1)} n = \log(\log^{(i)} n)$. Note that if $m/n \ge \log^{(i)} n$ for some
constant i then $\beta(m,n) \le i$, so the time to find a minimum spanning tree is O(m). This paper
presents a related result: If a graph has girth at least $\log^{(i)} n$ for some constant i then a
minimum spanning tree can be found in O(m) time.
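As an illustration of these definitions, the following minimal Python sketch evaluates the iterated logarithm and $\beta(m,n)$ numerically. The function names `iterated_log` and `beta` are hypothetical, chosen here for illustration, and the sketch assumes m and n are positive.

```python
import math


def iterated_log(i, n):
    """Return log^(i) n: the base-two logarithm applied i times (log^(0) n = n)."""
    x = float(n)
    for _ in range(i):
        x = math.log2(x)
    return x


def beta(m, n):
    """Return min{ i : log^(i) n <= m/n }, assuming m >= 1 and n >= 1."""
    i = 0
    x = float(n)
    # While the loop runs, x > m/n > 0, so math.log2(x) is always defined.
    while x > m / n:
        x = math.log2(x)
        i += 1
    return i
```

For a fixed n, the value of `beta(m, n)` shrinks as the density m/n grows, which is the sense in which dense graphs are the easy case for the bound quoted above.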

Section 2 presents the results. This section closes with definitions and background from graph
theory and data structures.

If S is a set and e an element, $S + e$ denotes $S \cup \{e\}$ and $S - e$ denotes $S - \{e\}$. For a graph
G, V(G) and E(G) denote its vertex set and edge set, respectively. Hence for the given graph G,
n = |V(G)| and m = |E(G)|. An edge e is incident to a subgraph H if one or both ends is in V(H)
but $e \notin E(H)$.


A tree (pseudotree) component of a graph G is a connected component of G that is a tree
(pseudotree). A spanning pseudoforest P for a graph G consists of every tree component of G, plus
for every other connected component C of G, one or more pseudotree components that partition
V(C). Note that P contains exactly |V(C)| edges of C.

The set merging problem [T] is to maintain a collection of disjoint sets which, after initialization,
is subject to two operations:

unite(S,S'): form a new set $S \cup S'$, thereby destroying sets S and S';

find(e): return the name of the set containing element e.

The set merging algorithm used in Section 2 is union by size. It represents each set S by a union
tree, i.e., a tree whose nodes are the elements of S. A unite makes the root of the smaller union tree
a child of the root of the larger. An operation find(v) is done by following the path in the union
tree from v to the root. (No path compression is done). Hence a unite operation is O(1) time and
find(v) is O(log s), where s is the size of the set containing v.
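The set merging structure just described can be sketched as follows. This is a minimal Python illustration, not the paper's implementation: the class name `UnionBySize` and the use of dictionaries are assumptions made here, and `unite` expects two distinct set names (roots) as returned by `find`.

```python
class UnionBySize:
    """Disjoint sets stored as union trees; union by size, no path compression."""

    def __init__(self, elements):
        # Every element starts as the root of its own one-node union tree.
        self.parent = {v: v for v in elements}
        self.size = {v: 1 for v in elements}

    def find(self, v):
        """Follow parent pointers from v to the root, which names the set."""
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def unite(self, a, b):
        """Merge the sets named by the distinct roots a and b; return the new root."""
        if self.size[a] < self.size[b]:
            a, b = b, a
        # The root of the smaller tree becomes a child of the root of the larger.
        self.parent[b] = a
        self.size[a] += self.size[b]
        return a
```

Because the smaller tree always hangs below the larger, a node's depth grows only when the size of its set at least doubles, which gives the O(log s) bound for find.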

In this paper a priority queue is a data structure on a universe that is partitioned into disjoint
queues, where each element has a real-valued cost, and after initialization the following operations
can be performed:

meld(Q,Q'): form a new queue by combining Q and Q', thereby destroying queues
Q and Q';

find_min(Q): return the smallest cost element in queue Q;

delete(e,Q): remove element e from queue Q.

The algorithm used in Section 2 implements priority queues with Fibonacci heaps [FT]. The following
time bounds hold: meld is O(1); find_min(Q) is O(log s), where s is the size of Q; delete(e,Q)
is O(log s), where s is the size of the Fibonacci tree containing e. Note these are amortized time
bounds. Also to achieve the bound for delete the algorithm of [FT] is modified slightly, making
it lazier. Unlike [FT] a queue does not keep track of its minimum element. Rather find_min(Q)
links trees of Q until there is at most one tree of each rank, and then finds and returns the desired
minimum. delete(e,Q) cuts e from its parent and adds the children of e to the list of trees of Q.
The analysis of [FT] easily extends to prove the above time bounds. (The same time bounds can
be achieved using binomial queues [Br] modified to do lazy melding).

2. The algorithm.

The algorithm is based on a locality property similar to one possessed by minimum spanning
trees [T].


Lemma 2.1. Let P be a subgraph of a minimum spanning pseudoforest. Let e be a smallest cost
edge incident to some tree component T of P. Then P + e is a subgraph of a minimum spanning
pseudoforest.

Proof. Let P* be a minimum spanning pseudoforest containing P, and suppose P* does not
contain e. Let f be an edge of P* that is incident to T such that the component of P* - f
containing T is a tree. (Specifically if T is in a tree component of P* then f is an edge of P* incident
to T; if T is in a pseudotree component with cycle C, then f is an edge of P* incident to T on C
or on the path from T to C). By definition, $c(e) \le c(f)$. Hence P* - f + e is the desired minimum
spanning pseudoforest. ∎

The algorithm enlarges a subgraph P to a minimum spanning pseudoforest. For efficiency it
grows the components of P at approximately the same rate. More precisely let d(v) denote the
degree of vertex v in the given graph G; the (total) degree of a subgraph H is $\sum \{ d(v) \mid v \in V(H) \}$.
The algorithm grows components so that they have similar degrees. The details are as follows.

The algorithm initializes P to contain every vertex v of G (v is initially a tree component of
P). It then repeats the following step as long as P contains a tree component with an incident
edge:

Enlarging Step. Choose a tree component T of smallest degree and add to P a minimum cost edge
incident to T.

Correctness of this algorithm follows from the lemma; clearly pseudoforest P spans G when
the algorithm halts.
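The enlarging step can be illustrated with a deliberately naive Python sketch. It follows the rule of choosing a tree component of smallest degree and adding a cheapest incident edge, but it rescans all vertices and edges in every step, so it does not achieve the O(m + n) bound; it is meant only to make the rule concrete. The function name, the `(u, v, cost)` edge format with vertices numbered 0..n-1, and the tie-breaking order are assumptions of this sketch.

```python
def minimum_spanning_pseudoforest(n, edges):
    """Return indices of the edges chosen by repeated enlarging steps.

    edges is a list of (u, v, cost) triples on vertices 0..n-1.
    Naive version: each step rescans the graph, so this is not linear time.
    """
    comp = list(range(n))                  # comp[v] = label of v's component in P
    is_tree = {v: True for v in range(n)}  # label -> component is still a tree
    deg = [0] * n                          # degree of each vertex in G
    for u, v, _ in edges:
        deg[u] += 1
        deg[v] += 1
    in_p = [False] * len(edges)
    chosen = []
    while True:
        # Total degree of every tree component.
        total = {}
        for v in range(n):
            if is_tree[comp[v]]:
                total[comp[v]] = total.get(comp[v], 0) + deg[v]
        # Cheapest incident edge (not yet in P) of every tree component.
        best = {}
        for idx, (u, v, c) in enumerate(edges):
            if in_p[idx]:
                continue
            for t in {comp[u], comp[v]}:
                if is_tree[t] and (t not in best or c < edges[best[t]][2]):
                    best[t] = idx
        if not best:
            break  # no tree component has an incident edge
        t = min(best, key=lambda label: total[label])
        idx = best[t]
        u, v, _ = edges[idx]
        s = comp[v] if comp[u] == t else comp[u]  # component at the other end
        in_p[idx] = True
        chosen.append(idx)
        if s == t:
            is_tree[t] = False  # the edge closes a cycle: T becomes a pseudotree
        else:
            merged_is_tree = is_tree[s]
            for w in range(n):
                if comp[w] == s:
                    comp[w] = t
            is_tree[t] = merged_is_tree  # tree + tree = tree, tree + pseudotree = pseudotree
            del is_tree[s]
    return chosen
```

On the complete graph with four vertices and edge costs 1 through 6, for example, the sketch selects the four cheapest edges, which form a single pseudotree.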

The enlarging step is implemented with the following data structures. A set merging data
structure maintains the partition of V(G) induced by the components of P. Each component of P
is marked as a tree or pseudotree. Each tree component T maintains its degree d(T), and a priority
queue of incident edges Q(T), ordered by cost. An edge can be in two priority queues, in which
case the two occurrences are linked by pointers. There is an array C[1..2m], where C[d] points to
a doubly-linked list of all tree components of degree d with an incident edge.

With this data structure the enlarging step works as follows: The outermost loop examines
the entries in C in increasing order to find the next smallest tree component T. T is removed
from its C-list. The smallest edge e in Q(T) is obtained using find_min. The set merging data
structure finds the two components containing the ends of e, say T and S. If S = T it is marked
as a pseudotree. If $S \ne T$ then sets V(S) and V(T) are united; further if S is a tree it is deleted
from its C-list, e is deleted from Q(S) and Q(T), these queues are melded, the new tree component
$S \cup T$ sets degree $\delta = d(S) + d(T)$ and is added to the list $C[\delta]$ if its queue is nonempty. Finally in
all cases, e is added to P.

To estimate the time, note that all initialization uses O(m + n) time. The time for all enlarging
steps, excluding priority queue find_mins and deletes and set merging finds, is O(m + n). To estimate
the time for find_mins, deletes and finds, define the rank of a component C as

$$r(C) = \lfloor \log d(C) \rfloor.$$

A simple induction shows that when T is chosen in the enlarging step, the size of any Fibonacci tree
is at most d(T) (recall that find_min is the only operation that enlarges Fibonacci trees; initially
every edge is in its own Fibonacci tree). A similar induction shows that when T is chosen the
height of the union tree for any component C is at most $\min\{r(C), 1 + r(T)\}$ (since T's height is at
most r(T)). Thus the find_min, find and two deletes for T take time $O(\log d(T) + r(T)) = O(r(T))$.
Let $\mathcal{T}(r)$ denote the set of all rank r tree components chosen as T in the enlarging step. Then the
total find_min, delete and find time is at most a constant times

$$\sum_{r} r\,|\mathcal{T}(r)|.$$

For any rank r, any edge is counted in the degree of at most two trees of $\mathcal{T}(r)$ (since the enlarging
step unites T into a pseudotree or increases the rank of the component containing T). Hence
$\sum \{ d(T) \mid T \in \mathcal{T}(r) \} \le 2m$. Any $T \in \mathcal{T}(r)$ has $d(T) \ge 2^r$. Thus $|\mathcal{T}(r)| \le m/2^{r-1}$. This implies the
total time is at most a constant times $\sum_r r\,m/2^{r-1} = O(m)$.
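To see why the final sum is linear in m, the series can be evaluated explicitly. The worked step below uses only the standard identity $\sum_{r \ge 0} r x^r = x/(1-x)^2$ for $|x| < 1$, taken at $x = 1/2$; the constant 4 is an illustration added here, not a figure stated in the analysis.

```latex
\[
\sum_{r \ge 0} r \cdot \frac{m}{2^{\,r-1}}
  \;=\; 2m \sum_{r \ge 0} \frac{r}{2^{r}}
  \;=\; 2m \cdot \frac{1/2}{(1 - 1/2)^{2}}
  \;=\; 4m \;=\; O(m).
\]
```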

Theorem 2.1. A minimum spanning pseudoforest can be found in time O(m + n). ∎

Now we turn to the minimum spanning tree problem. Let P be a minimum spanning pseudoforest.
Form a set C by choosing a maximum cost edge from each cycle of P.

Lemma 2.2. P - C is a subgraph of a minimum spanning tree.

Proof. Let T be a minimum spanning tree with as many edges of P as possible. Suppose P - C
is not a subgraph of T. Let Q be a component of $(P - C) \cap T$ that is not a component of P - C;
choose Q so it is not incident to an edge of C. Let e be an edge of P incident to Q such that the
component of P - e containing Q is a tree (e is found as in Lemma 2.1). Let f be an edge incident
to Q in the fundamental cycle of e in T (f exists since $e \notin T \cup C$). Then P - e + f is a spanning
pseudoforest, whence $c(e) \le c(f)$. T - f + e is a spanning tree containing more edges of P than T,
whence $c(f) < c(e)$. This contradiction proves the lemma. ∎

The lemma justifies the following minimum spanning tree algorithm. Find a minimum spanning
pseudoforest P. Form the forest F by deleting a maximum cost edge from each cycle of P; form
the graph G' by contracting each tree of F to a vertex. Find a minimum spanning tree T of G'.
Now $T \cup F$ is a minimum spanning tree of G.
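One step of this algorithm, deleting a maximum cost edge from each cycle of a pseudoforest, can be sketched in Python as follows. The sketch covers only that step; contraction and the final spanning tree computation are omitted. The function name `break_cycles`, the `(u, v, cost)` edge format, and the leaf-peeling approach are assumptions made here for illustration, and the input is assumed to be a valid pseudoforest on vertices 0..n-1.

```python
def break_cycles(n, pf_edges):
    """Split a pseudoforest into (forest_edges, removed_edges).

    pf_edges is a list of (u, v, cost) triples forming a pseudoforest.
    One maximum cost edge is removed from the cycle of every pseudotree.
    """
    incident = [set() for _ in range(n)]  # indices of remaining edges at each vertex
    for i, (u, v, _) in enumerate(pf_edges):
        incident[u].add(i)
        incident[v].add(i)
    on_cycle = [True] * len(pf_edges)
    # Peel leaves repeatedly; the edges that survive are exactly the cycle edges.
    stack = [v for v in range(n) if len(incident[v]) == 1]
    while stack:
        v = stack.pop()
        if len(incident[v]) != 1:
            continue
        (i,) = incident[v]
        a, b, _ = pf_edges[i]
        if a == b:
            continue  # a self-loop is a cycle of length one, not a leaf edge
        on_cycle[i] = False
        w = b if a == v else a
        incident[v].discard(i)
        incident[w].discard(i)
        if len(incident[w]) == 1:
            stack.append(w)
    # Group the surviving edges by cycle using simple parent labels.
    label = list(range(n))

    def find(x):
        while label[x] != x:
            x = label[x]
        return x

    for i, (u, v, _) in enumerate(pf_edges):
        if on_cycle[i]:
            label[find(u)] = find(v)
    heaviest = {}  # cycle label -> index of its most expensive edge
    for i, (u, v, c) in enumerate(pf_edges):
        if on_cycle[i]:
            r = find(u)
            if r not in heaviest or c > pf_edges[heaviest[r]][2]:
                heaviest[r] = i
    removed = set(heaviest.values())
    forest = [e for i, e in enumerate(pf_edges) if i not in removed]
    cut = [pf_edges[i] for i in sorted(removed)]
    return forest, cut
```

The first returned list plays the role of the forest F and the second the role of the deleted cycle edges.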

This algorithm improves the bound for minimum spanning trees in the following special case.
For the improvement it suffices to find T using the minimum spanning tree algorithm of [FT], which
uses time $O(m\beta(m,n))$ but is slightly simpler than [GGST]. Recall the girth g of a graph is the
length of a shortest cycle [H].

Theorem 2.2. Let G be a graph with girth $g \ge \log^{(i)} n$ for some constant i. Then a minimum
spanning tree of G can be found in time O(m).

Proof. Except for finding T, the algorithm uses linear time. Let n' = |V(G')|, m' = |E(G')|,
so T is found in time $O(m'\beta(m',n'))$. Clearly $n' \le n/g$ and $m' \le m$. Note that $m\beta(m,n)$ is an
increasing function of m (since $\beta(m,n) \le n$ and $\beta(m+1,n) \ge \beta(m,n) - 1$). Hence
$m'\beta(m',n') \le m\beta(m,n') \le m\beta(m,n/g)$. Since $m \ge n$, $\beta(m,n/g) \le \beta(n,n/g) \le \beta(ng,n)$. Since $g \ge \log^{(i)} n$,
$\beta(ng,n) \le i$ by definition. This gives the theorem. ∎

In conclusion, a minimum spanning tree can be found in linear time if the graph has density
or girth at least $\log^{(i)} n$. This narrows the open case down to graphs that are extremely sparse.